Universal Top-k Keyword Search over Relational Databases
Authors
Abstract
Keyword search is one of the most effective paradigms for information discovery, and one of its key advantages is simplicity. There is an increasing need to let ordinary users issue keyword queries without any knowledge of the database schema. The retrieval unit of keyword search over relational databases differs from that of IR systems: in IR systems the retrieval unit is a document, whereas in our case the result is a synthesized document formed by joining a number of tuples, a joined tuple tree (JTT). We measure result quality using two metrics: structural quality and content quality. The content quality of a JTT is an IR-style score that indicates how well its information nodes match the keywords, while the structural quality of a JTT is a score that evaluates how meaningful the connections among its information nodes are, for example, the closeness of the corresponding relationship. We design a hybrid approach and develop a buffer system that dynamically maintains a partial data graph in memory. To reuse intermediate results of SQL queries, we break complex SQL queries into two types of simple queries. This allows us to support very large databases and to reduce redundant computation. In addition, we conduct extensive experiments on large-scale real datasets to study the performance of the proposed approaches. The experiments show that our approach outperforms previous approaches, especially in terms of result quality.
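As a rough illustration of the two metrics named in the abstract, the following sketch ranks candidate results by a content score and a structural score. Everything here is an assumption made for illustration and is not taken from the paper: the `JoinedTupleTree` container, the keyword-coverage content score, the inverse-edge-count structural score, and the weighted sum controlled by `alpha` are all hypothetical stand-ins for the paper's actual scoring functions.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinedTupleTree:
    """Hypothetical container for one result: tuples joined into a tree."""
    tuple_texts: tuple  # text of each tuple (node) in the tree
    num_edges: int      # number of join edges connecting those tuples


def content_quality(tree, keywords):
    """Toy IR-style score (assumed): fraction of keywords found in the tree."""
    words = set()
    for text in tree.tuple_texts:
        words.update(text.lower().split())
    hits = sum(1 for kw in keywords if kw.lower() in words)
    return hits / len(keywords) if keywords else 0.0


def structural_quality(tree):
    """Toy closeness score (assumed): fewer join edges means a tighter result."""
    return 1.0 / (1 + tree.num_edges)


def top_k(trees, keywords, k, alpha=0.5):
    """Return the k best trees under an assumed weighted sum of both scores."""
    def score(tree):
        return (alpha * content_quality(tree, keywords)
                + (1 - alpha) * structural_quality(tree))
    return heapq.nlargest(k, trees, key=score)
```

Setting `alpha` closer to 1 favours keyword coverage, while values closer to 0 favour tightly connected results; how the paper actually balances the two metrics is not stated in the abstract.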
Similar resources
Scalable Continual Top-k Keyword Search in Relational Databases
Keyword search in relational databases has been widely studied in recent years because it requires users neither to master a structured query language nor to know the complex underlying database schemas. Most existing methods focus on answering snapshot keyword queries in static databases. In practice, however, databases are updated frequently, and users may have long-term in...
Fuzzy Multi-Join and Top-K Query Model for Search-As-You-Type in Multiple Tables
A search-as-you-type system computes answers on the fly as a user types a keyword query, character by character. There is a growing need to support search-as-you-type on data residing in a relational DBMS. Existing work on keyword queries focuses on supporting this type of search using native database SQL. Leveraging existing database functionalities is meant to meet high performa...
Guest Editors' Introduction: Special Section on Keyword Search on Structured Data
With the prevalence of Web search engines, keyword search has become the most popular way for users to retrieve information from text documents. On the other hand, there is an enormous amount of valuable information stored in structured form (relational or semistructured) in Internet, intranet, and enterprise databases. To query such data sources, users traditionally depended on specialized app...
Join-Based Algorithms for Keyword Search in XML Databases
We consider the problem of keyword search in XML databases under the exclusive lowest common ancestor (ELCA) semantics. Our analysis shows that ELCA semantics may conflict with the concept of keyword proximity, and that under such semantics lower ELCAs are preferable because lower elements tend to be more specific. However, existing algorithms (stack-based and index-based) do not provide efficient...
A Hidden Markov Model Approach to Keyword-Based Search over Relational Databases
We present a novel method for translating keyword queries over relational databases into SQL queries with the same intended semantic meaning. In contrast to the majority of existing keyword-based techniques, our approach does not require any a priori knowledge of the data instance. It follows a probabilistic approach based on a Hidden Markov Model for computing the top-K best mappings of th...
Publication date: 2011